Sublist Shared Cache Improvements #726

Merged: 5 commits merged from sublist_cache into master on Aug 27, 2018
Conversation

derekcollison (Member):

Sublists have a shared caching layer. Most of the extreme performance in NATS comes from the client's L1 cache of the sublist, which has no contention. This works well as long as the sublist has a fairly low rate of change. However, when highly concurrent reads hit the sublist's shared cache, performance suffered.

Also, #710 showed a case where the shared cache made things worse.

This PR alleviates some of those issues by being smarter about adding or removing literal subjects from the shared cache.

It also uses Go 1.9's sync.Map, which performs better under highly concurrent conditions with many cores.

/cc @nats-io/core
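
Editor's note: a rough, self-contained sketch of the two ideas above, using simplified stand-in types (Sublist, SublistResult, and subscription here are illustrative, not the server's actual definitions). The shared cache is a sync.Map, and when a subscription's subject is literal only the cache entry keyed by that exact subject needs updating, so the whole cache does not have to be scanned.

package main

import "sync"

// Illustrative stand-ins, not the nats-server definitions.
type subscription struct{ subject string }
type SublistResult struct{ psubs []*subscription }

func (r *SublistResult) addSubToResult(sub *subscription) *SublistResult {
	return &SublistResult{psubs: append(append([]*subscription(nil), r.psubs...), sub)}
}

type Sublist struct {
	cache sync.Map // published subject -> *SublistResult
}

// subjectIsLiteral approximates the server's check: '*' and '>' only count
// as wildcards when they form a token of their own.
func subjectIsLiteral(subject string) bool {
	for i := 0; i < len(subject); i++ {
		if subject[i] == '*' || subject[i] == '>' {
			if (i == 0 || subject[i-1] == '.') && (i+1 == len(subject) || subject[i+1] == '.') {
				return false
			}
		}
	}
	return true
}

// addToCache sketches the literal fast path: a literal subscription subject
// can only affect the cache entry keyed by that exact subject, so there is
// no need to walk the whole shared cache. Wildcard subjects still require a
// scan of the cache (see the Range() discussion below).
func (s *Sublist) addToCache(subject string, sub *subscription) {
	if subjectIsLiteral(subject) {
		if v, ok := s.cache.Load(subject); ok {
			s.cache.Store(subject, v.(*SublistResult).addSubToResult(sub))
		}
		return
	}
	// Wildcard case intentionally omitted in this sketch.
}

func main() {
	s := &Sublist{}
	s.cache.Store("foo.bar", &SublistResult{})
	s.addToCache("foo.bar", &subscription{subject: "foo.bar"})
}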

derekcollison requested a review from kozlovic on August 26, 2018 at 00:55
coveralls commented Aug 26, 2018

Coverage increased (+0.1%) to 92.122% when pulling 2c4b7e7 on sublist_cache into 34b556d on master.

kozlovic (Member) left a comment:

One of the Range() callbacks returns false to stop the iteration, and I am not sure that should be the case.

r := v.(*SublistResult)
if matchLiteral(key, subject) {
s.cache.Store(key, r.addSubToResult(sub))
return false
Member:

Returning false will stop the iteration. That's not what the original code was doing.
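
Editor's note: a minimal sketch of what the corrected callback would look like, reusing the names from the snippet quoted above; the fix is to return true so that Range() keeps visiting every cached entry.

s.cache.Range(func(k, v interface{}) bool {
	key := k.(string)
	r := v.(*SublistResult)
	if matchLiteral(key, subject) {
		s.cache.Store(key, r.addSubToResult(sub))
	}
	// Returning true continues the iteration; returning false would stop
	// Range() after the first matching entry.
	return true
})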

Member Author:

Correct, will fix. I will add a test too.

Member Author:

I added a separate test, but could fold it into the larger cache test if you think that is better.

Member Author:

I moved it into the larger cache test and also double-checked that it failed without the change.

checkBool(subjectIsLiteral("foo.*.>"), false, t)
checkBool(subjectIsLiteral("foo.*.bar"), false, t)
checkBool(subjectIsLiteral("foo.bar.>"), false, t)
}
Member:

I think we had changes in the not so distant past that made, say, foo*.bar a literal. You may want to add a test verifying that * and > are indeed not treated as wildcards when they are not tokens of their own.
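
Editor's note: a sketch of what such additional checks could look like, following the checkBool/subjectIsLiteral pattern from the snippet above (the expected values assume an embedded * or > is treated as a literal character):

checkBool(subjectIsLiteral("foo*.bar"), true, t)   // '*' is not a token of its own
checkBool(subjectIsLiteral("foo.bar*"), true, t)
checkBool(subjectIsLiteral("foo.b>ar"), true, t)   // '>' is not a token of its own
checkBool(subjectIsLiteral("foo.*.bar"), false, t) // still a wildcard as its own token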

Member Author:

Good idea.

Signed-off-by: Derek Collison <[email protected]>
derekcollison (Member, Author):

Let me make sure we are green, then I will merge.

gwik (Contributor) left a comment:

Thanks for addressing this issue.

I think this should clear the contention issues I raised in #710 for the most part, since the write lock is no longer held while matching and the sweeping is async.
However, I still feel both the server's "client" cache and the sublist cache are inefficient in this use case, and I'd rather disable them entirely.
I will run my experiment again, compare it with the no-cache version I have, and let you know. I will also run the benchmark suite; the contention benchmarks are a good addition :)

I think there may be a race in the cache handling.

delete(s.cache, k)
break
}
s.cache.Store(subject, result)
Contributor:

I think this introduces a race. For example, if a client subscribes between the unlock and the call to s.cache.Store, it will be shadowed and won't be delivered messages for this subscription until the cache entry gets cleared.
Also, for queue subscriptions, a client could have removed a subscription (while still being connected) and be delivered messages it is no longer processing, which would "steal" them from other valid subscriptions of the same queue.

The read lock should be held while calling s.cache.Store to ensure that the sublist won't be modified concurrently.
There may be similar issues with Load; although I don't see one at the moment, holding the read lock there is probably a good idea too.
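
Editor's note: a minimal, self-contained sketch of the suggested fix, using simplified stand-in types rather than the server's actual code. Both the match and the cache Store happen while the read lock is held, so a subscribe or unsubscribe (which needs the write lock) cannot slip in between them and end up shadowed by a stale cache entry.

package main

import "sync"

// Illustrative stand-in for the sublist; not the nats-server type.
type Sublist struct {
	sync.RWMutex
	subs  map[string][]string // subscription state, guarded by the RWMutex
	cache sync.Map            // subject -> cached match result
}

func (s *Sublist) Match(subject string) []string {
	if r, ok := s.cache.Load(subject); ok {
		return r.([]string)
	}
	s.RLock()
	// Compute the result and publish it to the cache while still holding
	// the read lock; a writer cannot modify s.subs (and invalidate this
	// result) until the lock is released.
	result := append([]string(nil), s.subs[subject]...)
	s.cache.Store(subject, result)
	s.RUnlock()
	return result
}

func main() {
	s := &Sublist{subs: map[string][]string{"foo.bar": {"subA"}}}
	_ = s.Match("foo.bar")
}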

Member:

Exactly why I disliked that Go added sync.Map. It "introduces" these kinds of issues.

Contributor:

The same class of problems can happen with atomic operations, but they are still useful. Concurrency is hard to reason about...

Member Author:

Let me take a closer look. Let me know if your production issue is resolved with this fix, though.

Member Author:

OK, #729 should handle it. Thanks for pointing this out, you were correct.
